- CHAPTER 25 - WHAT DOES IT ALL MEAN?
-
-
- What does it all mean? Not a whole lot, actually. We can now save
- memory space and we can save a lot of time. But with the
- incessant march of technology these things mean less and less. A
- few years ago, when 64k or 128k was a lot of memory and memory
- was expensive, having a 20k program instead of a 40k program was
- a significant advantage. Now it means almost nothing unless it is
- a memory resident program. What about disk space? Just a while
- back we were operating with two 360k floppy disks and a hard disk
- was too expensive. Nowadays everyone has a 20meg hard disk.{1} And
- speed? Those programs that were slow on an 8088 now seem o.k. on
- an 80386. Those programs that were unbearably slow on the 8088
- died a quick death and are no longer around.
-
- Compilers are better and they have more subroutines available.
- They are also easier to program than going to the assembler
- level. What this chapter is about is when NOT to use the standard
- compiler functions and subroutines.
-
- First, you should understand that all compiler subroutines are
- general purpose subroutines. They need to be all things to all
- people. Imagine what a vehicle would be like if we gave the
- designer the following specification:
-
- We want to be able to drive to the store for groceries. It
- should be fuel efficient. In case we want to go into the
- mountains it should be an all terrain vehicle. We also want
- to be able to haul a roomful of furniture from coast to
- coast. Oh yes, and we want to be able to race it at Le Mans.
-
- Being universal requires a lot of code and it slows things down.
- Whether this extra code and time is too much is a question you
- need to decide for yourself. First, here are some examples of
- size. This is a C program that does almost nothing:
-
-      #include <stdio.h>
-
-      int main(void)
-      {
-          int x ;
-          x = 27 ;                    /* line 1 */
-          scanf ( "%d", &x ) ;        /* line 2 */
-          printf ( "%d\n", x ) ;      /* line 3 */
-          return 0 ;
-      }
- ____________________
-
- 1. Which has led to one of my pet peeves. All installation
- programs for compilers and word processors dump EVERYTHING on the
- hard disk. This gives us subdirectories that have 50 files in
- them, and we don't have the foggiest notion of what any of the
- files are for. If these installation programs would only prompt
- us by type of file to find out what we want to install and want
- to leave off the hard disk, we would all be better off.
-
- ______________________
-
- The PC Assembler Tutor - Copyright (C) 1990 Chuck Nelson
-
- I have made 3 programs from this. LINE1.C has line 1 only. LINE2.C
- has lines 1 and 2. LINE3.C has lines 1, 2 and 3. For those non-C
- people, scanf is an input function and printf is an output function.
- Guess how big each program is. Here's the directory listing.
-
- LINE1 EXE 3176 6-22-90 8:48a
- LINE2 EXE 7170 6-22-90 8:49a
- LINE3 EXE 9134 6-22-90 8:49a
-
- It takes about 3000 bytes to start a C program (this is the startup
- module), about 4000 bytes more to read something in, and about 2000
- bytes extra to print something. That first 3000 bytes is unavoidable
- if you are writing in a high level language. If you are doing a lot
- of general purpose i/o, these extra amounts aren't too bad. There
- are two cases where you might want to use your own i/o routines:
- first, when you have something simple, and second, when you have
- something special.
-
- If you don't need all that flexibility, you are better off doing
- your own i/o. Here are two files that write a text screen to a
- disk file.
-
- COPYSCRN EXE 10454 6-10-90 9:30a
- INTSCRN COM 445 6-12-90 7:40p
-
- They both do the same thing except that the .COM file is a little
- more sophisticated. Notice the difference in size. Speed really
- doesn't play a part here because what they do is so simple that
- it takes just a second in any case. The program was so simple
- that it only took an hour or two to write, so I didn't lose any
- time by writing it in assembler.
-
- The other case is when you have a specific idea of what the
- screen should look like. You want control of the whole screen all
- the time. This includes all word processors, databases,
- programming environments, etc. They all take charge of the screen
- because some DOS functions are too slow. If you remember from the
- ZOOM chapter, there is a radical difference between what you can
- do and what DOS can do. Even though these large programs are
- written in C, they all bypass the C i/o functions. That does not
- mean that they go down to the assembler level, however.
-
-
- INTERRUPTS
-
- You have done a few interrupts. They call the standard DOS or
- BIOS functions. Remember, they do this by going into low memory
- (the first 1k of memory) and getting the address of the
- subprogram that handles that particular interrupt. However, you
- do not need to call these interrupts from the assembler level.
- All modern compilers support interrupt calls in the language. If
- yours doesn't, you need a more recent compiler. Before going on
- with this chapter you need to read your compiler documentation
- about interrupts. TURBO Pascal has INTR, QuickBASIC and QuickC
- have INT86. Read the documentation now.
-
- Have you read it? No cheating is allowed, because you won't
- understand the rest of this if you haven't read it.
-
- Though technically the register block is a structure in C and a
- record in Pascal, it is really an array in which each element has
- a specific name. The interrupt routine copies these values into
- the corresponding registers, calls the interrupt, then copies the
- register values back into the array. Some languages have one
- array for the input and another for the output. Int 21h is
- special, so QuickC has a special function called INTDOS. It is the
- same as using Int 21h.
-
- The order of registers in the array is arbitrary and language
- dependent. For Turbo Pascal it is (AX, BX, CX, DX, BP, SI, DI, DS,
- ES, FLAGS). You enter values in the registers specified by the
- interrupt, and then call the interrupt. The routine does the
- rest.
-
-      INTR ( int_no: byte; var the_regs: Registers)
-
- This will push the interrupt number, then the array address. On
- entry to the interrupt call and after initializing BP, we will
- have:
-                int_no             bp + 6
-                array_address      bp + 4
-                old IP             bp + 2
-        bp ->   old BP             bp
-
- What follows is not the exact code, but is similar to what the
- Pascal routine does:
-
- ; - - - - - - - - - - - - - - - - - - - -
- intr    proc    near
-
-         push    bp
-         mov     bp, sp
-         push    ax              ; save all registers except SP, DS, SS, CS
-         push    bx
-         push    cx
-         push    dx
-         push    si
-         push    di
-         push    bp              ; this is OUR bp
-         push    es
-
-         ; insert the interrupt number in the interrupt
-         mov     al, [bp+6]      ; AL now contains the interrupt number
-         lea     si, interrupt_spot      ; where the interrupt is
-         mov     cs:[si+1], al   ; insert it in the interrupt
-
-         ; change all the registers
-         mov     si, [bp+4]      ; array address is DS:SI
-         mov     ax, [si]
-         mov     bx, [si+2]
-         mov     cx, [si+4]
-         mov     dx, [si+6]
-         mov     bp, [si+8]
-         mov     di, [si+12]
-         mov     es, [si+16]
-
-         ; special manipulation for DS and SI
-         push    ds              ; save ds
-         push    si              ; save si
-         push    ax              ; temp save of ax from array
-         mov     ax, [si+14]     ; ds from array to ax
-         mov     si, [si+10]     ; si from array to si
-         mov     ds, ax          ; now move ax to ds
-         pop     ax              ; restore ax
-
-         ; call the interrupt
- interrupt_spot:
-         int     0               ; dummy number for the interrupt
-
-         ; special needs for SI and DS
-         ; our SI and DS are at the top of the stack
-         ; save values of flags, si and ds from interrupt
-         pushf                   ; value from interrupt
-         push    si              ; value from interrupt
-         push    ds              ; value from interrupt
-         add     sp, 6           ; get to our si and ds
-         pop     si              ; our si
-         pop     ds              ; our ds
-         sub     sp, 10          ; sp is where it was a moment ago.
-         mov     [si], ax        ; DS:SI points to array
-         mov     [si+2], bx
-         mov     [si+4], cx
-         mov     [si+6], dx
-         mov     [si+8], bp
-         mov     [si+12], di
-         mov     [si+16], es
-         pop     word ptr [si+14]        ; ds from the interrupt
-         pop     word ptr [si+10]        ; si from the interrupt
-         pop     word ptr [si+18]        ; flags from the interrupt
-         add     sp, 4           ; skip our DS and SI (already in regs)
-
-         pop     es
-         pop     bp              ; this is OUR bp
-         pop     di
-         pop     si
-         pop     dx
-         pop     cx
-         pop     bx
-         pop     ax
-
-         mov     sp, bp
-         pop     bp
-
-         ret     (4)             ; clear arguments off the stack
-
- intr    endp
- ; - - - - - - - - - - - - - - - - - - - -
-
- This should test your insight into using code. DS and SI are
- needed for moving data, so we use some kludges to get it to work.
-
- There are two things here that you shouldn't normally do. First,
- we are inserting the interrupt number directly in the machine
- code. Secondly, we are playing around with the value of SP. These
- are rare exceptions and shouldn't occur in your own code unless
- absolutely necessary.
-
- The first thing you are going to say is, "Gosh, that's a lot of
- code for one interrupt." True, especially when the interrupt is
- interrupt 12h. Here's int 12h inside of our template file:
-
- ; - - - - - START CODE HERE
-         int     12h             ; machine memory (return in ax)
-         call    print_unsigned
- ; - - - - - END CODE HERE
-
- It finds out how much memory your computer has and returns the
- number of kbytes in AX. But how much extra time does using this
- Pascal interrupt routine take? About 700 clocks or about .0002
- seconds (that's right, 2 ten thousandths) on the slowest machine.
- How many times will you call it during a program? Only one time.
- There is no point in going down to the assembler level to write a
- program that saves you .0002 seconds. In Pascal, you would write:
-
- INTR ( $12 , the_regs) ;
-
- and be done with it. No big loss of time and no trouble at all.
-
-
- In fact, as far as I can see, there is no reason for doing any
- interrupts from the assembler level. You may want to do a whole
- subprogram that contains interrupts, but if you just need one or
- two interrupts, it is easier to work from inside the high-level
- language.
-
- This includes the i/o we were talking about a minute ago. You can
- write a screen program inside a high level language using arrays.
- Just think of a screen as an 80 X 25 array. If a two dimensional
- array is too slow you need to go to a one dimensional array. All
- interrupts that tell what kind of video card is in the computer,
- what mode the screen is in, etc. can be done from the high-level
- language. The most you need assembler for (depending on the
- language) is moving the text array into video memory. You want a
- bunch of help screens? Put all the help screens in a single file
- and use the interrupt for random access file read to read a
- screen when you need it.{2}
-
- Anything else? Yes, we still have the need for speed. There are
- certain types of operations like block moves of data, word
- searches and sorting of arrays that are characterized by large
- amounts of data and/or large amounts of computation. If you think
- you see a way to use registers effectively for one of these
- things, you probably can beat a compiled version of the
- subprogram. Then the only question is whether or not it is worth
- the trouble.
- ____________________
-
- 2. What you actually want to do is have the first block of
- data in the file tell you where each screen is and how long its
- data is. Then the first 2 bytes or words of the screen data
- should say the dimensions of the screen data ( 12 X 25, 17 X 3,
- etc.). This will allow you to store and use screens of any size.
-
- ____________________
-
- We have used the words "fast" and "slow" ambiguously so far, but
- now it is time to quantify them. Before you get the numbers, you
- need to know one thing about memory. People always talk about the
- "data bus". What is it? It is a group of wires connecting the
- 80x86 chip to memory. The 8088 has 8 wires, the 8086, 80286 and
- 80386SX have 16 wires, and the 80386 has 32 wires. That means
- that the 8088 can transfer 8 bits of information at one time, the
- 8086 et al. can transfer 16 bits at a time and the 80386 can
- transfer 32 bits at a time. This means one byte, two byte and
- four byte transfers respectively. This also means that the memory
- bytes are ordered a little differently. You will never notice it
- externally, but here is the different internal ordering.
-
- The 8088 has all bytes one after the other. All memory
- read/writes are done with the same 8 wires:
-
- 8088
- MEMORY ADDRESSES
-
- 00005
- 00004
- 00003
- 00002
- 00001
- 00000
- data lines |||||||| (8 bits)
-
- (All our examples will use absolute memory locations starting at
- 00000). The chips with a 16 bit data bus have all the even
- locations on the first 8 wires and the odd locations on the other
- 8 wires. They come in pairs - first even then odd:
-
- 8086
- MEMORY ADDRESSES
-
- 00006 00007
- 00004 00005
- 00002 00003
- 00000 00001
- data lines |||||||| |||||||| (16 bits)
-
- When one of these chips reads or writes, it can read/write either
- the left or the right byte or the whole word. What it cannot do
- is read the right byte from one pair along with the left byte
- from another pair. If you want to read the word at 00005:00006,
- the 8086 must:
-
- 1) read the 00005 byte.
- 2) read the 00006 byte.
- 3) join them together.
-
- This takes longer than just a single word read.
-
- The true 80386 has a 32 bit data bus. This allows it to read 4
- bytes at a time, and its physical memory structure looks like
- this:
-
- 80386
- MEMORY ADDRESSES
-
- 00010 00011 00012 00013
- 0000C 0000D 0000E 0000F
- 00008 00009 0000A 0000B
- 00004 00005 00006 00007
- 00000 00001 00002 00003
- data lines |||||||| |||||||| |||||||| |||||||| (32 bits)
-
- Instead of memory pairs, we now have memory quadruplets. As long
- as a word is totally inside of one quadruplet, the read/write
- time will be unaffected. If the read/write crosses the boundary
- (as we did above), the read/write time will be affected in the
- same way. The 80386 can also read 4 byte data quickly as long as
- the total data is inside of one memory quadruplet.
-
- In the 8086 family, data can always be read across these
- boundaries but it takes more time. (On the IBM 370, on the other
- hand, there are instructions that REQUIRE that data be aligned
- along 32 bit boundaries).
-
- This means you should order your data in the following way in the
- data segment:
-
- QWORD DATA
- DWORD DATA
- TBYTE DATA ; this is for the 8087
- WORD DATA
- BYTE DATA ; all strings, etc.
-
- This ensures that any read/write for that type of data will
- always be as fast as possible. If the segment definition has no
- alignment type, it will start on a paragraph boundary - i.e.
- every 16 bytes, and will work with anything. {3}
-
- In addition, if you ever subtract a number from SP to provide for
- a temporary data area, it should always be an even number. If SP
- is at an odd address instead of an even address, it takes longer
- for PUSHes and POPs. Also, when you define the size of the stack
- segment, it should be an even number of bytes.
-
- Having said that, it is now time for you to see the speeds of
- instructions. Read the introduction to APPENDIX III, then glance
- at the times to get the general idea of how fast the instructions
- are. Come back to this chapter when you are comfortable with what
- the times look like.
- ____________________
-
- 3. The alignment type is a word after the word SEGMENT which
- says how the segment should be aligned. The following:
-
-      DATASTUFF SEGMENT BYTE PUBLIC 'DATA'
-
- says the segment can be aligned at any byte. The allowable forms
- are BYTE, WORD, DWORD, PARA, PAGE (256 bytes). If there is no
- explicit type, the default is PARA.
-
- ____________________
-
- Have you read APPENDIX III? If not, do it before going on.
-
- The compiled languages all have one thing in common. They tell
- you that if you are writing a subroutine, you need to return from
- the subroutine with DS, BP, SS, and SP unchanged. They don't say
- a thing about any of the other registers. One thing this tells us
- is that they are doing everything from memory locations, not
- register locations. If you have taken a good look at the
- execution times, you will have noticed the phenomenal difference
- in time between a "memory, register" addition and a "register,
- register" addition.
-
- Now, if all you are going to do is move a number to a register,
- add it, and move it out again, a compiler can do it as fast as
- you can. But, if you run into a situation where you can use three
- or four registers at the same time, you can cut the execution
- time drastically. Compilers really can't use registers as
- efficiently as we can (yet). This is an ideal spot for using
- assembly language.
-
- The old adage that 10% of the code uses 90% of the computer time
- is appropriate here. You now know about assembler language, and
- you know what you want to do with it, so go out and enjoy. But
- before you do, try to slog your way through the next chapter on
- "simplified" segment definitions and linking to high level
- languages.
-
-